Goto

Collaborating Authors

 polynomial kernel









Linear and Kernel Classification in the Streaming Model: Improved Bounds for Heavy Hitters

Neural Information Processing Systems

We study linear and kernel classification in the streaming model. For linear classification, we improve upon the algorithm of (Tai, et al. 2018), which solves the $\ell_1$ point query problem on the optimal weight vector $w_* \in \mathbb{R}^d$ in sublinear space. We first give an algorithm solving the more difficult $\ell_2$ point query problem on $w_*$, also in sublinear space. We also give an algorithm which solves the $\ell_2$ heavy hitter problem on $w_*$, in sublinear space and running time. Finally, we give an algorithm which can $\textit{deterministically}$ solve the $\ell_1$ point query problem on $w_*$, with sublinear space improving upon that of (Tai, et al. 2018). For kernel classification, if $w_* \in \mathbb{R}^{d^p}$ is the optimal weight vector classifying points in the stream according to their $p^{th}$-degree polynomial kernel, then we give an algorithm solving the $\ell_2$ point query problem on $w_*$ in $\text{poly}(\frac{p \log d}{\varepsilon})$ space, and an algorithm solving the $\ell_2$ heavy hitter problem in $\text{poly}(\frac{p \log d}{\varepsilon})$ space and running time. Note that our space and running time are polynomial in $p$, making our algorithms well-suited to high-degree polynomial kernels and the Gaussian kernel (approximated by the polynomial kernel of degree $p = \Theta(\log T)$).


Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?

Neural Information Processing Systems

Low-rank approximation is a common tool used to accelerate kernel methods: the n n kernel matrix K is approximated via a rank-k matrix K which can be stored in much less space and processed more quickly. In this work we study the limits of computationally efficient low-rank kernel approximation.


A Simple Approach to Automated Spectral Clustering Appendices Jicong Fan 1, 2, Yiheng T u

Neural Information Processing Systems

K, when n is large (e.g. The time complexity is O ( kτn). We have the following result. It shows that when two data points in X, e.g. Hence KLSR with Gaussian kernel utilizes local information to enhance C . The algorithm of AutoSC-GD with only LSR and KLSR is shown in Algorithm 1.